23 research outputs found

    Detecting and Preventing Hallucinations in Large Vision Language Models

    Instruction-tuned Large Vision Language Models (LVLMs) have made significant advancements in generalizing across a diverse set of multimodal tasks, especially Visual Question Answering (VQA). However, generating detailed responses that are visually grounded is still a challenging task for these models. We find that even the output of current state-of-the-art LVLMs (InstructBLIP) still contains a staggering 30 percent of hallucinatory text in the form of non-existent objects, unfaithful descriptions, and inaccurate relationships. To address this, we introduce M-HalDetect, a Multimodal Hallucination Detection Dataset that can be used to train and benchmark models for hallucination detection and prevention. M-HalDetect consists of 16k fine-grained labels on VQA examples, making it the first comprehensive multi-modal hallucination detection dataset for detailed image descriptions. Unlike previous work that considers only object hallucination, we additionally annotate both entity descriptions and relationships that are unfaithful. To demonstrate the potential of this dataset for preference alignment, we propose fine-grained Direct Preference Optimization (DPO), and we train fine-grained multi-modal reward models and evaluate their effectiveness with best-of-n rejection sampling. We perform human evaluation on both DPO and rejection sampling, and find that they reduce hallucination rates by 41% and 55% respectively, a significant improvement over the baseline. Comment: preprint
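
The best-of-n rejection sampling mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `best_of_n`, `make_toy_generator`, and `toy_reward` are hypothetical names, and the toy reward simply penalises a word we pretend is absent from the image, standing in for a trained multi-modal reward model.

```python
import itertools
from typing import Callable, List


def best_of_n(prompt: str,
              generate: Callable[[str], str],
              reward: Callable[[str, str], float],
              n: int = 4) -> str:
    """Sample n candidate responses and keep the one the reward function scores highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda response: reward(prompt, response))


def make_toy_generator(captions: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LVLM: cycles through fixed captions."""
    caption_cycle = itertools.cycle(captions)
    return lambda prompt: next(caption_cycle)


def toy_reward(prompt: str, response: str) -> float:
    """Hypothetical stand-in for a reward model: penalise each mention of
    'unicorn', which we assume is not present in the image."""
    return -float(response.count("unicorn"))


generator = make_toy_generator([
    "a dog and a unicorn",
    "a dog on grass",
    "a unicorn next to a unicorn",
])
print(best_of_n("Describe the image.", generator, toy_reward, n=3))
```

With a real model, `generate` would sample from the LVLM and `reward` would call the trained hallucination reward model; larger `n` trades more compute for a better chance of a faithful response.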

    Single-neuron axonal reconstruction: The search for a wiring diagram of the brain

    Reconstruction of the axonal projection patterns of single neurons has been an important tool for understanding both the diversity of cell types in the brain and the logic of information flow between brain regions. Innovative approaches now enable the complete reconstruction of axonal projection patterns of individual neurons with vastly increased throughput. Here, we review how advances in genetic, imaging, and computational techniques have been exploited for axonal reconstruction. We also discuss how new innovations could enable the integration of genetic and physiological information with axonal morphology for producing a census of cell types in the mammalian brain at scale. First author draft

    Masked Vision and Language Modeling for Multi-modal Representation Learning

    In this paper, we study how to use masked signal modeling in vision and language (V+L) representation learning. Instead of developing masked language modeling (MLM) and masked image modeling (MIM) independently, we propose to build joint masked vision and language modeling, where the masked signal of one modality is reconstructed with the help of the other modality. This is motivated by the nature of image-text paired data: both the image and the text convey almost the same information, but in different formats. The masked signal reconstruction of one modality conditioned on the other can also implicitly learn cross-modal alignment between language tokens and image patches. Our experiments on various V+L tasks show that the proposed method not only achieves state-of-the-art performance when using a large amount of data, but also outperforms the other competitors by a significant margin in regimes of limited training data.

    Factors Affecting the Outcome in Traumatic Subarachnoid Hemorrhage

    Objective: To define risk factors affecting the outcome in traumatic subarachnoid hemorrhage. Material and Methods: Forty-four patients with traumatic subarachnoid hemorrhage were evaluated retrospectively. They were divided into three groups according to their age: elderly (≥65 years), adult (16-64 years), and children (<16 years). The clinical picture on admission was evaluated using the Glasgow Coma Scale. The patients were also divided into three groups according to their coma grading on admission: mild injury (Glasgow Coma Scale score 13-15), moderate injury (8-12), and severe injury (3-7). The amount of subarachnoid blood shown on computerized tomography was evaluated according to the Fisher index, and additional tomography findings were recorded. At last follow-up, the presence of headache and neurological deficits as well as return to work or school were investigated, and the final clinical picture was evaluated with the Glasgow Outcome Scale. Results: There were 11 children, 23 adults and 10 elderly patients. Twelve patients died between 1 and 49 days after trauma; the others were followed for a mean of 14.6 months (range 10 to 30 months). In the children group, the Glasgow Coma Scale score was significantly higher (p=0.004), the amount of subarachnoid blood was significantly lower, and the Glasgow Outcome Scale score was significantly better compared to the other groups. For all groups, higher trauma severity on admission was associated with a higher Fisher index (p=0.016). The most important factors affecting clinical results were severity of head injury on admission (p=0.0001), Fisher index (p=0.003), and presence of additional findings on computerized tomography (p=0.0001). Conclusion: Traumatic subarachnoid hemorrhage usually has a good clinical outcome in children; however, in elderly patients, the outcome is worse, and there are usually additional intracranial traumatic lesions. The most important factors affecting outcome are the amount of blood on the first computerized tomography, head trauma severity, and the presence of additional intracranial traumatic lesions.
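
The grouping rules stated in this abstract can be written down as a short sketch. The thresholds are taken from the text; the function names `age_group` and `injury_severity` are hypothetical and do not come from the study.

```python
def age_group(age_years: float) -> str:
    """Age groups as defined in the abstract: children <16, adult 16-64, elderly >=65."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years >= 65:
        return "elderly"
    if age_years >= 16:
        return "adult"
    return "children"


def injury_severity(gcs_score: int) -> str:
    """Admission severity as defined in the abstract:
    mild 13-15, moderate 8-12, severe 3-7 (Glasgow Coma Scale)."""
    if not 3 <= gcs_score <= 15:
        raise ValueError("Glasgow Coma Scale scores range from 3 to 15")
    if gcs_score >= 13:
        return "mild"
    if gcs_score >= 8:
        return "moderate"
    return "severe"


# Example: a 70-year-old admitted with a GCS score of 9
print(age_group(70), injury_severity(9))
```

Note that the moderate/severe boundary here (8 versus 7) follows the abstract as written.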

    Crowdsourcing the creation of image segmentation algorithms for connectomics

    To stimulate progress in automating the reconstruction of neural circuits, we organized the first international challenge on 2D segmentation of electron microscopic (EM) images of the brain. Participants submitted boundary maps predicted for a test set of images, and were scored based on their agreement with a consensus of human expert annotations. The winning team had no prior experience with EM images, and employed a convolutional network. This “deep learning” approach has since become accepted as a standard for segmentation of EM images. The challenge has continued to accept submissions, and the best so far has resulted from cooperation between two teams. The challenge has probably saturated, as algorithms cannot progress beyond limits set by ambiguities inherent in 2D scoring and the size of the test dataset. Retrospective evaluation of the challenge scoring system reveals that it was not sufficiently robust to variations in the widths of neurite borders. We propose a solution to this problem, which should be useful for a future 3D segmentation challenge.

    Fumigant toxicity of essential oil of Hypericum perforatum L., 1753 (Malpighiales: Hypericaceae) to Tenebrio molitor L., 1758 (Coleoptera: Tenebrionidae)

    In this study, the vapor of essential oil obtained by the hydrodistillation of Hypericum perforatum L., 1753 (Malpighiales: Hypericaceae) was tested on the different stages of Tenebrio molitor L., 1758 (Coleoptera: Tenebrionidae). The larval, pupal and adult stages of T. molitor were exposed to different doses of H. perforatum essential oil for 24 h. After exposure, mortality rate, LC50, LC90 and LC99 values, antioxidant enzyme activities [superoxide dismutase (SOD), catalase (CAT), glutathione-S-transferase (GST) and glutathione peroxidase (GPx)], acetylcholinesterase (AChE) activity and malondialdehyde (MDA) levels were measured in the insects. Tenebrio molitor was cultured at Gazi University Department of Biology and all analyses were done at Yozgat Bozok University in 2017 and 2018. The results indicated that the pupae of T. molitor were the most tolerant and adults were the most sensitive. Mortality increased with increasing concentration of essential oil. Also, increasing doses of essential oil caused a decrease in SOD, CAT, GST, GPx and AChE activities and an increase in MDA level. These results indicate that the essential oil of H. perforatum can be used against T. molitor in a pest control program.

    Differential effects of clozapine and risperidone on facial emotion recognition ability in patients with treatment-resistant schizophrenia

    Objective: Clozapine and risperidone are used for treatment-resistant schizophrenia and are known to improve positive and negative symptoms. However, there are conflicting findings about their effects on social cognition, which is measured by facial emotion recognition ability. Impairments in facial emotion recognition ability have frequently been observed at different stages of the illness and might have negative influences on psychosocial functioning. In the present study, we aimed to examine the effects of clozapine and risperidone on recognizing facial emotions in patients with treatment-resistant schizophrenia. Methods: Thirty-four patients were screened for the study, and 19 patients were included. All patients were evaluated with the Positive and Negative Syndrome Scale (PANSS), the Calgary Depression Scale for Schizophrenia, and the Functional Remission of General Schizophrenia Scale at baseline and after 16–20 weeks of clozapine (n = 12) or risperidone (n = 7) treatment. Furthermore, the Facial Emotion Recognition Test was performed before and after treatment. The test included photos of four male and four female models (56 mixed photos in total) with happy, surprised, fearful, sad, angry, disgusted, and neutral facial expressions from Ekman and Friesen’s catalog. Results: The mean dose of the index drug in the clozapine group was 295.83 ± 103.26 mg/day. The mean positive (p = .002), negative (p = .050), general psychopathology (p = .002), and total (p = .002) scores according to the PANSS were significantly improved after treatment. The mean dose of the index drug in the risperidone group was 6.86 ± 1.57 mg/day. The mean positive symptom (p = .018) and total (p = .041) scores were significantly improved after treatment, but the negative symptom (p = .396) and general psychopathology (p = .149) scores did not change. There were no significant differences between baseline and after treatment in either the clozapine or the risperidone group in the accuracy rate of facial emotion recognition (p > .05 for each). At baseline, patients in the risperidone group were significantly more impaired in recognizing disgusted faces than those in the clozapine group (p = .032), and recognition was also significantly poorer after treatment with risperidone than with clozapine (p = .031). The patients responded significantly faster after treatment to all facial emotions except fearful faces (p = .355). Conclusions: Clozapine and risperidone were not found to have extensive effects on the ability to recognize facial emotions because of their ineffectiveness against negative symptoms, as in our study. We speculate that the higher dopaminergic receptor occupancy rate of risperidone in the insular cortex, compared with that of clozapine, might be related to hypo-activation of the insula, which is associated with a particular deficit in the ability to recognize expressions of disgust in patients with schizophrenia. Impaired facial emotion recognition ability is present even in first-episode psychosis and might be a trait marker in schizophrenia.